
feat: Add tell() to OutputStream writers #2998

Merged
kevinjqliu merged 1 commit into apache:main from geruh:tellme on Feb 2, 2026

Conversation

@geruh (Member) commented on Feb 2, 2026

Rationale for this change

Currently, PyIceberg writes one manifest per snapshot operation, regardless of manifest size. To eventually support rolling manifests, we need to be able to track written bytes without closing the file, so that we can roll to a new file once we hit the target size.

Some of this work was done in #650, but we can keep this change simple and add the rolling writers as a follow-up. Conveniently, the underlying streams we support already have a tell() method; we just need to expose it.

With this change in place, the follow-up can do:

with write_manifest(...) as writer:
    writer.add_entry(entry)
    if writer.tell() >= target_file_size:
        ...  # roll to a new file

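For context, here is a minimal sketch of the delegation this kind of change boils down to, assuming the writer wraps a file-like binary output stream; the StreamWriter class and _output_stream attribute below are illustrative, not PyIceberg's actual names:

import io
from typing import BinaryIO

class StreamWriter:
    """Hypothetical writer wrapping a file-like binary output stream."""

    def __init__(self, output_stream: BinaryIO) -> None:
        self._output_stream = output_stream

    def write(self, data: bytes) -> None:
        self._output_stream.write(data)

    def tell(self) -> int:
        # Delegate to the underlying stream, which already tracks
        # how many bytes have been written so far.
        return self._output_stream.tell()

writer = StreamWriter(io.BytesIO())
writer.write(b"manifest entry bytes")
assert writer.tell() == len(b"manifest entry bytes")
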
Are these changes tested?

Yes, added a test :)

Are there any user-facing changes?

No

@kevinjqliu (Contributor) left a comment

nice! lgtm

@kevinjqliu merged commit 7e66ccb into apache:main on Feb 2, 2026
11 checks passed

paultmathew pushed a commit to paultmathew/iceberg-python that referenced this pull request on May 7, 2026

Currently `Table.append(df)` and `Table.overwrite(df)` only accept a
materialised `pa.Table`, which forces callers to load the entire dataset into
memory before writing. This makes pyiceberg unusable for large or unbounded
inputs and has been a recurring complaint (apache#1004, apache#2152, dlt-hub#3753).

Allow `pa.RecordBatchReader` as an alternative input. When a reader is
provided, batches are streamed and microbatched into target-sized Parquet
files via the new `bin_pack_record_batches` helper, then committed in a
single snapshot via the existing fast_append path. Memory is bounded by
`write.target-file-size-bytes` (default 512 MiB) per worker rather than the
full input size.
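
As a rough sketch of the bin-packing idea (the actual `bin_pack_record_batches` helper's signature and size accounting may differ; `nbytes` here measures in-memory Arrow size, used as a proxy for the eventual Parquet file size):

from typing import Iterator, List
import pyarrow as pa

def bin_pack_record_batches(
    reader: pa.RecordBatchReader,
    target_file_size_bytes: int,
) -> Iterator[List[pa.RecordBatch]]:
    """Group streamed batches into bins of roughly the target size."""
    bin_batches: List[pa.RecordBatch] = []
    bin_size = 0
    for batch in reader:
        bin_batches.append(batch)
        bin_size += batch.nbytes
        if bin_size >= target_file_size_bytes:
            yield bin_batches  # flush one target-sized bin
            bin_batches, bin_size = [], 0
    if bin_batches:
        yield bin_batches  # final partial bin

Each yielded bin would then be written to its own Parquet file, with all files committed together in one fast_append snapshot.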

Scope of this PR — unpartitioned tables only. Streaming into partitioned
tables raises NotImplementedError pointing back to apache#2152; partitioned support
needs additional design (high-cardinality partition handling, per-partition
rolling writers) and is tracked as a follow-up. Mirrors iceberg-go#369's
staging — that project shipped unpartitioned streaming first.

Internal note: the implementation buffers up to `target_file_size` of in-
memory RecordBatches before flushing to a Parquet file. A more memory-
efficient rolling-ParquetWriter approach is a planned follow-up that will
benefit from the `OutputStream.tell()` API added in apache#2998.
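
To make that planned follow-up concrete, here is a hedged sketch of the rolling-writer pattern that tell() enables (open_new_stream and the surrounding names are hypothetical, not a committed PyIceberg API):

import io
from typing import Callable, Iterable, BinaryIO

def write_rolling(
    chunks: Iterable[bytes],
    open_new_stream: Callable[[], BinaryIO],
    target_file_size: int,
) -> None:
    """Write serialized chunks, rolling to a fresh file once the
    current stream's tell() passes the target size."""
    stream = open_new_stream()
    try:
        for chunk in chunks:
            stream.write(chunk)
            if stream.tell() >= target_file_size:
                stream.close()              # seal the full file
                stream = open_new_stream()  # roll to the next one
    finally:
        stream.close()

# Example: roll every 16 bytes using in-memory streams.
write_rolling([b"0123456789abcdef"] * 3, io.BytesIO, target_file_size=16)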